20 research outputs found

    Music Information Retrieval Meets Music Education

    This paper addresses the use of Music Information Retrieval (MIR) techniques in music education and their integration into learning software. A general overview of systems that are either commercially available or at the research stage is presented. Furthermore, three well-known MIR methods used in music learning systems, and their state of the art, are described: music transcription, solo and accompaniment track creation, and generation of performance instructions. As a representative example of a music learning system developed within the MIR community, the Songs2See software is outlined. Finally, challenges and directions for future research are described.

    Improving semi-supervised learning for audio classification with FixMatch

    Including unlabeled data in the training process of neural networks using Semi-Supervised Learning (SSL) has shown impressive results in the image domain, where state-of-the-art results were obtained with only a fraction of the labeled data. The commonality between recent SSL methods is that they rely strongly on the augmentation of unannotated data. This remains largely unexplored for audio data. In this work, SSL using the state-of-the-art FixMatch approach is evaluated on three audio classification tasks, covering music, industrial sounds, and acoustic scenes. The performance of FixMatch is compared to Convolutional Neural Networks (CNN) trained from scratch, Transfer Learning, and SSL using the Mean Teacher approach. Additionally, a simple yet effective approach for selecting suitable augmentation methods for FixMatch is introduced. FixMatch with the proposed modifications always outperformed Mean Teacher and the CNNs trained from scratch. For the industrial sounds and music datasets, the CNN baseline performance using the full dataset was reached with less than 5% of the initial training data, demonstrating the potential of recent SSL methods for audio data. Transfer Learning outperformed FixMatch only on the most challenging dataset, acoustic scene classification, showing that there is still room for improvement.
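The core idea behind FixMatch, as the abstract describes it, is to rely on augmentation of unannotated data: the model's prediction on a weakly augmented input serves as a pseudo-label, and the model is trained to reproduce it on a strongly augmented version, but only when the prediction is confident. A minimal sketch of that unlabeled-data loss is shown below; the function name, the toy probabilities, and the 0.95 confidence threshold are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def fixmatch_unlabeled_loss(weak_probs, strong_probs, threshold=0.95):
    """Toy FixMatch consistency loss for a batch of unlabeled examples.

    weak_probs:   predictions on weakly augmented inputs, shape (N, C)
    strong_probs: predictions on strongly augmented inputs, shape (N, C)
    Only examples whose weak prediction exceeds `threshold` contribute;
    the weak argmax is used as a hard pseudo-label (cross-entropy).
    """
    pseudo_labels = weak_probs.argmax(axis=1)      # hard pseudo-labels
    confidence = weak_probs.max(axis=1)
    keep = np.where(confidence >= threshold)[0]    # confident examples only
    if keep.size == 0:
        return 0.0
    ce = -np.log(strong_probs[keep, pseudo_labels[keep]] + 1e-12)
    return float(ce.mean())

# Example: two unlabeled clips; only the first prediction is confident.
weak = np.array([[0.98, 0.02], [0.60, 0.40]])
strong = np.array([[0.90, 0.10], [0.55, 0.45]])
loss = fixmatch_unlabeled_loss(weak, strong)
```

The confidence mask is what makes the choice of augmentation methods matter: overly strong augmentation of the "weak" branch suppresses confident pseudo-labels and silently removes unlabeled data from training.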

    Using Machine Learning for Intelligent Production: talk given at the Technologietag, 9 October 2018, Erfurt

    Machine-learning scientist Sascha Grollmisch of Fraunhofer IDMT explained to the audience the potential that machine learning holds for production processes. "The use of machine learning methods is a powerful tool when it comes to making a production line intelligent," said Grollmisch. Machine learning systems have so far been applied very successfully at Fraunhofer IDMT in video analysis, speech recognition, and music analysis. The goal of the institute's development work is to create a system that learns autonomously from acoustic measurement data to assess the quality of production processes or products. Grollmisch emphasized that there is no universal recognition solution for all application scenarios. Rather, the meaningful use of suitable machine learning methods requires an individual understanding of the problem as well as a tailored system design. The consensus of the subsequent discussion was that customer trust in the reliability of machine learning methods and their results must be established.

    Automatic Chord Recognition in Music Education Applications

    In this work, we demonstrate the market-readiness of a recently published state-of-the-art chord recognition method that extends automatic chord recognition beyond major and minor chords to seventh chords. To do so, the proposed chord recognition method was integrated into the Songs2See Editor, which already includes the automatic extraction of the main melody, bass line, beat grid, key, and chords for any musical recording.

    Audible Defects: Monitoring Machines and Products Using Acoustic Signals

    The human ear is astonishingly capable. Often, one can tell from the sound alone whether a component is functional or faulty. Scientists at Fraunhofer IDMT are working to teach machines this kind of "hearing" in order to identify defective products during production, thereby contributing to automated quality assurance.

    Acoustic Quality Control with Artificial Intelligence

    With the label "4.0" at the latest, industry has in recent years begun the process of digitalization and is now implementing the progressive automation of production processes. The networking of plants and components is at the center of this effort: sensors increasingly take over the function of the human senses and thereby offer manifold potential for optimizing the value chain. Developments in sensor and measurement technology, combined with advances in machine learning methods, form the basis for innovative ways of automating processes. In this context, the Fraunhofer Institute for Digital Media Technology IDMT in Ilmenau develops acoustic methods for assuring the quality of processes and products, which can be adapted to different applications in industrial production. This article presents the individual components required for an integrated system solution for airborne-sound analysis, using a concrete application case.

    IDMT-ISA-Compressed-Air Dataset

    The IDMT-ISA-Compressed-Air (IICA) dataset aims to foster research in compressed-air leak detection from acoustic emissions in the audible hearing range, with recordings of air leaks in a simulated industrial compressed-air network. The dataset contains recordings of multiple leak types with different types of industrial background noise played via external loudspeakers at two different volumes during the recording process.

    Leak types:
    - Vent leak
    - Vent leak, low pressure
    - Tube leak

    Noise types:
    - Lab noise (no added background noise)
    - Hydraulic machine noise
    - Hydraulic machine noise, low volume
    - General factory workshop noise
    - General factory workshop noise, low volume

    For each combination of leak and noise types, there were three recording sessions. During each session, four Earthworks M30 omnidirectional measurement microphones placed in different configurations recorded the acoustic emission of the compressed-air network. Each recording session contains 128 files of 30 seconds each, corresponding to each combination of leak, noise, and microphone.

    Total files: 5592
    Sampling rate: 48 kHz
    Resolution: 32-bit mono audio

    See the above-referenced paper and the README contained in the data folder for further details.
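The recording-condition structure described above (three leak types, five noise types, three sessions, four microphones) can be enumerated programmatically, for example to build evaluation splits by condition. The identifier strings below are illustrative assumptions, not the dataset's official file naming:

```python
from itertools import product

# Illustrative condition labels; the dataset's actual file naming may differ.
leak_types = ["vent_leak", "vent_leak_low_pressure", "tube_leak"]
noise_types = [
    "lab_noise",                    # no added background noise
    "hydraulic_machine",
    "hydraulic_machine_low_volume",
    "factory_workshop",
    "factory_workshop_low_volume",
]
sessions = [1, 2, 3]
microphones = [1, 2, 3, 4]

# Every (leak, noise) condition was recorded in three sessions
# with four microphones, per the dataset description.
conditions = list(product(leak_types, noise_types))
recordings = list(product(conditions, sessions, microphones))

print(len(conditions))   # 15 leak/noise conditions
print(len(recordings))   # 180 condition/session/microphone combinations
```

Grouping by `(leak, noise)` condition rather than by individual file is a common precaution with such datasets, so that recordings of the same physical setup do not leak between training and test splits.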

    IDMT-SMT-Chords Dataset

    The IDMT-SMT-Chords dataset comprises 16 MIDI-generated audio files covering various chord classes. The focus is on chord voicings commonly used on keyboard instruments and guitars; accordingly, the files are categorized into guitar and non-guitar instruments. Several software instruments from Ableton Live and GarageBand were used to synthesize the MIDI files, including piano, synthesizer pad, and acoustic and electric guitar.

    Total duration: 4.1 hours
    Chord segments: 7398
    WAV files: 16
    Chord duration: 2 seconds
    BPM: 120
    Time signature: 4/4
    Sampling rate: 44.1 kHz, mono audio

    Non-guitar
    The non-guitar files include all chord types in all possible root-note positions and inversions. For example, the C major triad is included with its two possible inversions, C/E and C/G. All non-guitar chord types are listed below:
    - Major (+ 2 inversions)
    - Minor (+ 2 inversions)
    - Major 7 (+ 3 inversions)
    - Minor 7 (+ 3 inversions)
    - Power chord, root and fifth (+ 1 inversion)
    - Dominant 7 (+ 3 inversions)
    - Minor 7 flat 5 (+ 3 inversions)
    This gives 576 non-guitar chord classes.

    Guitar
    The guitar files were generated from barre chord voicings with the root note located on the low E, A, and D strings. For example, to model the major chord and its voicings, the open-position E major, A major, and D major shapes are moved up in 12 half-step increments (including the octave at the 12th fret), yielding 39 positions (13 × 3). List of guitar chord types:
    - Major (+ 2 voicings)
    - Minor (+ 2 voicings)
    - Major 7 (+ 2 voicings)
    - Minor 7 (+ 2 voicings)
    - Power chord, root and fifth (+ 2 voicings)
    - Dominant 7 (+ 2 voicings)
    - Minor 7 flat 5 (+ 2 voicings)
    This gives 273 different guitar chord classes.
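The guitar chord-class count stated above follows directly from the enumeration: three barre shapes (rooted on the E, A, and D strings) times 13 fret positions (open through the 12th fret) gives 39 positions, and seven chord types over those positions give 273 classes. A quick sketch reproducing the arithmetic (label names are illustrative):

```python
# Reproducing the guitar chord-class count from the dataset description.
root_strings = ["E", "A", "D"]   # strings carrying the barre-shape root
frets = range(13)                # open position through the 12th fret
chord_types = ["maj", "min", "maj7", "min7", "power", "dom7", "m7b5"]

positions = [(s, f) for s in root_strings for f in frets]
guitar_classes = [(t, p) for t in chord_types for p in positions]

print(len(positions))        # 39 positions (3 shapes * 13 frets)
print(len(guitar_classes))   # 273 guitar chord classes (7 types * 39)
```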